Electric vehicles (EVs) are widely regarded as an important means of reducing carbon emissions. This project analyzes EV usage data.
The data were collected by the China Electric Vehicle Association and cover NIO ES6 vehicles from March 1 to March 8, 2022. We treat these data as a proxy for EV use across China, although a single model observed over eight days may not represent all vehicles or seasons. The structure of the data set and summary statistics for the variables used in this analysis are shown below.
## Rows: 564,584
## Columns: 21
## $ AirCondMod <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ date <chr> "2022/3/2", "2022/3/2", "2022/3/2", "2022/3/2", "202…
## $ hour <int> 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, …
## $ region_code <int> 610703, 610722, 610723, 610725, 610726, 610729, 6108…
## $ model_type <chr> "ES6", "ES6", "ES6", "ES6", "ES6", "ES6", "ES6", "ES…
## $ vehicle_num <int> 9, 4, 1, 4, 2, 1, 1, 1, 3, 1, 2, 1, 1, 2, 1, 1, 12, …
## $ AmdTemp <dbl> 14.92, 15.13, 15.50, 15.10, 13.25, 17.00, 10.50, 10.…
## $ weather <int> 4, 4, 4, 4, 4, 4, 0, 0, 0, 9, 0, 0, 0, 9, 9, 9, 0, 0…
## $ humidity <int> 40, 40, 40, 40, 40, 40, 10, 10, 28, 15, 28, 28, 28, …
## $ pm25 <dbl> 41, 41, 41, 41, 41, 41, 18, 18, 38, 26, 38, 38, 36, …
## $ ComprActPwr_kwh <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ PtcTotActPwr_kwh <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ PwrActOfChi_kwh <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ trip_eng_kwh <dbl> 1.592, 3.175, 9.100, 2.240, 7.200, 0.000, 1.000, 0.5…
## $ IntrTemp <dbl> 22.09, 22.48, 22.58, 22.16, 22.95, 17.38, 20.86, 21.…
## $ CCU_FrntLeTempSet <dbl> 24.38, 24.63, 22.00, 25.30, 26.50, 29.00, 28.00, 31.…
## $ CCU_FrntRiTempSet <dbl> 25.12, 24.88, 22.00, 25.50, 26.50, 28.00, 28.00, 31.…
## $ CCU_FrntBlwSpd <dbl> 1.16, 1.25, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00…
## $ duration_s <dbl> 575.92, 688.75, 835.00, 835.20, 2247.00, 15.00, 1320…
## $ mileage_km <dbl> 7.65, 14.50, 30.00, 13.30, 37.60, 0.00, 4.40, 2.10, …
## $ VehSpd_kph <dbl> 38.28, 77.78, 130.05, 60.34, 68.86, 1.39, 12.79, 9.0…
## AirCondMod date hour region_code
## Min. :0.000 Length:564584 Min. : 0.00 Min. :110101
## 1st Qu.:0.000 Class :character 1st Qu.: 9.00 1st Qu.:320581
## Median :1.000 Mode :character Median :13.00 Median :360481
## Mean :1.327 Mean :13.19 Mean :362876
## 3rd Qu.:2.000 3rd Qu.:18.00 3rd Qu.:440515
## Max. :3.000 Max. :23.00 Max. :659008
##
## model_type vehicle_num AmdTemp weather
## Length:564584 Min. : 1.00 Min. :-40.00 Min. : 0.000
## Class :character 1st Qu.: 1.00 1st Qu.: 10.07 1st Qu.: 0.000
## Mode :character Median : 2.00 Median : 14.00 Median : 4.000
## Mean : 13.37 Mean : 13.99 Mean : 5.067
## 3rd Qu.: 8.00 3rd Qu.: 18.38 3rd Qu.: 9.000
## Max. :1328.00 Max. : 38.50 Max. :31.000
## NA's :11
## humidity pm25 ComprActPwr_kwh PtcTotActPwr_kwh
## Min. : 0 Min. :-100000000 Min. :0.0000 Min. :0.00000
## 1st Qu.: 32 1st Qu.: 20 1st Qu.:0.0000 1st Qu.:0.00000
## Median : 52 Median : 34 Median :0.0000 Median :0.00100
## Mean : 53 Mean : -406811 Mean :0.0257 Mean :0.03685
## 3rd Qu.: 74 3rd Qu.: 51 3rd Qu.:0.0260 3rd Qu.:0.03200
## Max. :100 Max. : 279 Max. :1.7680 Max. :4.14000
##
## PwrActOfChi_kwh trip_eng_kwh IntrTemp CCU_FrntLeTempSet
## Min. :0.000000 Min. :-76.800 Min. :-20.42 Min. :15.00
## 1st Qu.:0.000000 1st Qu.: 0.400 1st Qu.: 19.73 1st Qu.:22.60
## Median :0.000000 Median : 0.797 Median : 21.91 Median :24.00
## Mean :0.000157 Mean : 1.234 Mean : 21.42 Mean :23.90
## 3rd Qu.:0.000000 3rd Qu.: 1.400 3rd Qu.: 23.55 3rd Qu.:25.07
## Max. :0.961000 Max. : 32.000 Max. : 45.32 Max. :31.00
## NA's :6
## CCU_FrntRiTempSet CCU_FrntBlwSpd duration_s mileage_km
## Min. :15.00 Min. :1.000 Min. : 0.0 Min. :-350.400
## 1st Qu.:22.53 1st Qu.:1.570 1st Qu.: 280.0 1st Qu.: 1.100
## Median :23.92 Median :2.620 Median : 517.7 Median : 3.090
## Mean :23.83 Mean :2.733 Mean : 585.2 Mean : 4.619
## 3rd Qu.:25.00 3rd Qu.:3.500 3rd Qu.: 752.5 3rd Qu.: 6.000
## Max. :31.00 Max. :9.000 Max. :3599.0 Max. : 97.700
## NA's :6
## VehSpd_kph
## Min. : 0.00
## 1st Qu.: 15.79
## Median : 24.48
## Mean : 30.85
## 3rd Qu.: 38.00
## Max. :188.84
##
First, we remove rows with missing values and sort the data chronologically. This step drops 17 rows.
## [1] "Original data rows: 564584"
## [1] "Data rows after removing missing values: 564567"
## AirCondMod date hour region_code model_type vehicle_num AmdTemp weather
## 1 0 2022/3/1 0 110101 ES6 9 6.22 1
## 2 0 2022/3/1 0 110102 ES6 8 5.94 1
## 3 0 2022/3/1 0 110105 ES6 42 6.33 1
## 4 0 2022/3/1 0 110106 ES6 8 6.11 1
## 5 0 2022/3/1 0 110107 ES6 3 5.17 1
## 6 0 2022/3/1 0 110108 ES6 24 6.80 1
## humidity pm25 ComprActPwr_kwh PtcTotActPwr_kwh PwrActOfChi_kwh trip_eng_kwh
## 1 27 4 0 0 0 0.989
## 2 27 4 0 0 0 0.437
## 3 27 4 0 0 0 1.175
## 4 33 4 0 0 0 1.156
## 5 29 4 0 0 0 0.600
## 6 30 4 0 0 0 0.800
## IntrTemp CCU_FrntLeTempSet CCU_FrntRiTempSet CCU_FrntBlwSpd duration_s
## 1 15.21 26.00 26.33 1.22 644.89
## 2 14.94 25.44 25.81 1.27 252.63
## 3 14.68 25.92 25.87 1.52 566.89
## 4 15.02 26.61 26.50 1.46 511.78
## 5 9.89 22.50 22.50 1.00 435.00
## 6 14.32 25.52 25.10 1.29 591.32
## mileage_km VehSpd_kph
## 1 3.67 33.43
## 2 2.34 35.13
## 3 2.78 33.74
## 4 -2.27 32.21
## 5 2.67 35.25
## 6 2.81 33.67
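The cleaning step described above could be sketched in base R roughly as follows. This is a minimal illustration, not the original code: the small `ev` data frame is made-up toy data that only mimics a few of the real columns, and `date_parsed` is a helper column introduced here.

```r
# Toy stand-in for the real data; the values are made up for illustration.
ev <- data.frame(
  date        = c("2022/3/2", "2022/3/1", "2022/3/1", "2022/3/3"),
  hour        = c(17L, 8L, 0L, 9L),
  vehicle_num = c(9L, 4L, 1L, 2L),
  weather     = c(4L, NA, 1L, 0L),
  stringsAsFactors = FALSE
)

# Drop every row that has at least one missing value.
ev_clean <- na.omit(ev)

# Parse the "YYYY/M/D" strings so that sorting follows the calendar
# rather than character order, then sort by date and hour.
ev_clean$date_parsed <- as.Date(ev_clean$date, format = "%Y/%m/%d")
ev_clean <- ev_clean[order(ev_clean$date_parsed, ev_clean$hour), ]
rownames(ev_clean) <- NULL

print(paste("Original data rows:", nrow(ev)))
print(paste("Data rows after removing missing values:", nrow(ev_clean)))
```

Parsing the date before sorting matters for longer periods, because as plain text "2022/3/10" would sort before "2022/3/2".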
The plot above shows that in early March about half of the observed usage involved no air conditioning, which suggests that the climate was comfortable on many days. Heating modes (heat pump and PTC) accounted for 32.6% of usage, higher than the 17.4% share for cooling.
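Shares like the ones quoted above can be computed from the `AirCondMod` column with a frequency table. The sketch below uses made-up values, and it assumes that code 0 means "air conditioning off"; the meaning of codes 1 to 3 is not stated in the data summary, so they are left unlabeled.

```r
# Toy AirCondMod values (codes 0-3, as in the summary above); made up.
air_cond_mod <- c(0, 0, 0, 0, 0, 1, 1, 2, 3, 3)

# Count records per mode and convert the counts to percentages.
mode_counts <- table(air_cond_mod)
mode_share  <- round(100 * prop.table(mode_counts), 1)
print(mode_share)

# Share of records with any mode active.
# Assumption: code 0 means the air conditioning is off.
share_on <- 100 * mean(air_cond_mod != 0)
print(share_on)
```

Since each row of the real data aggregates several vehicles, weighting the counts by `vehicle_num` may be more appropriate than counting rows; which of the two the original plot used is not known.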
The plot above shows that the number of vehicles on the road follows a clear tidal pattern over the day, with peaks at 8:00 a.m. and 6:00 p.m., which coincide with typical commuting hours.
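A minimal sketch of how the hourly pattern could be obtained: sum `vehicle_num` for each `hour` and look for the largest totals. The data frame below is toy data, and `hourly` and `peak_hours` are names introduced for this example.

```r
# Toy data: a few (hour, vehicle_num) records; values are made up.
ev <- data.frame(
  hour        = c(7L, 8L, 8L, 12L, 18L, 18L, 22L),
  vehicle_num = c(3L, 10L, 12L, 5L, 9L, 11L, 2L)
)

# Total number of vehicles observed in each hour of the day.
hourly <- aggregate(vehicle_num ~ hour, data = ev, FUN = sum)
print(hourly)

# The two hours with the highest totals.
peak_hours <- hourly$hour[order(-hourly$vehicle_num)][1:2]
print(sort(peak_hours))

# To draw the curve: plot(hourly$hour, hourly$vehicle_num, type = "b")
```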
Using region_code, we selected the records for Heilongjiang (a northern province), Beijing (a municipality lying between the other two in latitude), and Shanghai (a municipality further south), and compared the air conditioner temperature settings of all car owners during the study period. The box plot above shows that owners in Heilongjiang set higher temperatures and owners in Shanghai set lower ones. This is consistent with geography and the climate of each place: the northern region is colder and the southern one is warmer.
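The regional selection can be sketched as follows, assuming that `region_code` follows the national six-digit administrative division codes, whose first two digits identify the province-level unit (11 for Beijing, 31 for Shanghai, 23 for Heilongjiang). The data are made up, and using `CCU_FrntLeTempSet` as the set temperature is an assumption, since the text does not say which setting column was plotted.

```r
# Toy records; the set temperatures are made up for illustration.
ev <- data.frame(
  region_code       = c(110101L, 110105L, 310104L, 310115L,
                        230102L, 230103L, 610703L),
  CCU_FrntLeTempSet = c(24.0, 25.0, 22.5, 23.0, 26.0, 27.0, 24.5)
)

# The first two digits of the six-digit code give the province-level unit.
province_prefix <- ev$region_code %/% 10000L
region_names <- c("11" = "Beijing", "31" = "Shanghai", "23" = "Heilongjiang")
ev$region <- unname(region_names[as.character(province_prefix)])

# Keep only the three regions of interest.
ev_sub <- ev[!is.na(ev$region), ]

# Median set temperature per region (the center line of each box).
median_set <- tapply(ev_sub$CCU_FrntLeTempSet, ev_sub$region, median)
print(median_set)

# To draw the box plot: boxplot(CCU_FrntLeTempSet ~ region, data = ev_sub)
```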
The distribution of air conditioning set temperatures has several characteristics:
Peak: The distribution is broadly unimodal, with most values concentrated between 20 and 30 °C. Most observations cluster around a central value.
Symmetry: The distribution appears relatively symmetrical, with the peak roughly in the center and a similar decline on both sides.
Tail behavior: The tails are short, with the density falling gradually towards 0 on both sides, so extreme values are uncommon.
Kernel density estimation: The kernel density estimate (orange line) follows the histogram closely, so it captures the overall shape of the data well.
Data fluctuations: Although the data are concentrated in one area, some points lie outside the peak region, which may indicate fluctuations or outliers.
The histogram above shows that most car owners set their air conditioners between 20 and 28 °C, with three local peaks at 23, 24, and 25 °C, which are the most popular settings.
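A histogram with a kernel density overlay, as described above, can be produced with base R. The sketch below uses simulated set temperatures (normal with mean 24 and SD 2, clamped to the observed 15 to 31 range), which is only an assumption made to obtain a similar shape; it is not the real data.

```r
set.seed(1)

# Simulated set temperatures in degrees Celsius; not the real data.
temp_set <- pmin(pmax(rnorm(5000, mean = 24, sd = 2), 15), 31)

# Histogram with 1-degree bins, computed without drawing.
h <- hist(temp_set, breaks = seq(15, 31, by = 1), plot = FALSE)

# Kernel density estimate with the default Gaussian kernel and bandwidth.
kde <- density(temp_set)

# Center of the most populated bin and share of values in 20-28 degrees.
modal_bin   <- h$mids[which.max(h$counts)]
share_20_28 <- mean(temp_set >= 20 & temp_set <= 28)
print(modal_bin)
print(share_20_28)

# To draw both: plot(h, freq = FALSE); lines(kde, col = "orange")
```

Plotting the histogram on the density scale (`freq = FALSE`) is what makes the bars and the density curve directly comparable.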
We can draw repeated random samples from the original data, calculate the mean of each sample, and plot the distribution of these sample means. According to the Central Limit Theorem, if the observations are independent and have finite variance, the distribution of the sample means approaches a normal distribution as the sample size grows, whatever the shape of the original distribution.
The distribution of the sample means is close to normal, which is consistent with the Central Limit Theorem.
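The resampling experiment can be sketched as follows. The population here is simulated from a right-skewed exponential distribution, chosen only as an assumption to show that a non-normal population still yields roughly normal sample means; `n_draws` and `sample_size` are illustrative values, not the ones used in the original analysis.

```r
set.seed(42)

# Simulated right-skewed population; not the real vehicle counts.
population <- rexp(100000, rate = 1 / 13)

n_draws     <- 2000  # number of repeated samples (illustrative)
sample_size <- 200   # observations per sample (illustrative)

# Draw repeated samples and record the mean of each one.
sample_means <- replicate(n_draws, mean(sample(population, sample_size)))

# The CLT predicts a mean close to the population mean and a standard
# deviation close to sd(population) / sqrt(sample_size).
theory_se <- sd(population) / sqrt(sample_size)
print(c(mean(sample_means), mean(population)))
print(c(sd(sample_means), theory_se))

# To inspect normality: hist(sample_means); qqnorm(sample_means)
```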
## [1] "Full data statistics:"
## MeanVehicle SDVehicle
## 1 13.37288 40.79825
## [1] "Simple random sample statistics:"
## MeanVehicle SDVehicle
## 1 2.9 3.695342
## [1] "Stratified sample statistics:"
## MeanVehicle SDVehicle
## 1 5.433235 21.36222
## [1] "Systematic sample statistics:"
## MeanVehicle SDVehicle
## 1 17.1 31.19633
In this case, systematic sampling gives the results closest to the full-data statistics (mean 17.1 and SD 31.20, against 13.37 and 40.80), possibly because sampling at fixed intervals covers the whole data set and captures its major trends. Stratified sampling (mean 5.43, SD 21.36) retains more of the variability in the data than simple random sampling (mean 2.9, SD 3.70), which may fail to reflect the overall statistics because the sample is too small or because of chance. Because the comparison rests on one sample per scheme, this ranking may not hold in general.
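The three sampling schemes compared above can be sketched in base R as follows. The data are simulated, the sample size `n` is illustrative, and stratifying by `hour` with equal allocation is an assumption, since the text does not say which strata or sample sizes were used.

```r
set.seed(2022)

# Simulated data with a skewed vehicle count; not the real data.
ev <- data.frame(
  hour        = sample(0:23, 10000, replace = TRUE),
  vehicle_num = rgeom(10000, prob = 0.07) + 1L
)
n <- 240  # total sample size (illustrative)

# Simple random sample: n rows drawn without replacement.
srs <- ev[sample(nrow(ev), n), ]

# Stratified sample: the same number of rows from each hour.
# Assumption: hour is the stratum and allocation is equal.
per_stratum <- n / 24
idx_by_hour <- split(seq_len(nrow(ev)), ev$hour)
strat_idx <- unlist(lapply(idx_by_hour, function(idx) {
  idx[sample.int(length(idx), per_stratum)]
}))
strat <- ev[strat_idx, ]

# Systematic sample: every k-th row from a random starting row.
k       <- floor(nrow(ev) / n)
start   <- sample(k, 1)
sys_idx <- seq(start, by = k, length.out = n)
sys     <- ev[sys_idx, ]

# Mean and standard deviation of vehicle_num for a data frame.
sample_stats <- function(d) {
  data.frame(MeanVehicle = mean(d$vehicle_num),
             SDVehicle   = sd(d$vehicle_num))
}
print(sample_stats(ev))
print(sample_stats(srs))
print(sample_stats(strat))
print(sample_stats(sys))
```

Repeating the draws with different seeds shows how much each estimate varies from sample to sample, which helps separate a real advantage of one scheme from chance.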
Although individual owners may behave randomly, a large amount of data reveals overall patterns. From the statistics on air conditioning set temperature, number of vehicles, and other variables, we found that air conditioning use in EVs varies by region, while the number of vehicles follows a tidal pattern over the day.